DATA RETRIEVAL SYSTEM

The invention relates to a data retrieval system, in particular a
system for retrieving hierarchically related objects from a relational
database.

This invention is described in relation to an LDAP (Lightweight 
Directory Access Protocol) directory server using a relational database 
as a backing store. This is a particularly appropriate example of 
hierarchical information which is nonetheless stored in a relational 
database. It will be seen from the following description, however, that 
the invention is not restricted to LDAP and is of use wherever 
hierarchically related objects (data) have to be stored and retrieved 
from a relational database. 

An LDAP directory of the known type comprises a collection of
hierarchically related objects, an example of which is shown in Figure 1.
The structure of the directory and content of its objects are typically 
determined by the contents of a schema object which is normally itself 
stored in the directory. The contents of this schema object comprise a 
set of object class definitions and a set of structural rules, as shown 
for the above example in Figure 2. The class definitions include a) a 
list of both mandatory (M) and optional (O) attributes for each object 
class allowed in the directory; and b) a list defining the hierarchical 
relationships between object classes and hence the inheritance rules for
class definitions. In the above example, all object classes other than
top are subclasses of the class top, thus inheriting the attribute
objectClass.

The structural rules control the arrangement of objects in the 
directory hierarchy and comprise a list of the allowed child object 
classes to each parent class and, for each such combination, the naming
attribute(s) to be used to provide a unique relative distinguished name
(RDN) for such an object.

The relative distinguished name (RDN) provides a unique name for an
object at that point in the directory hierarchy. Its format is thus
somewhat unpredictable for any object, as it is formed by a combination
of one or more of the object's attributes and, as can be seen from the
naming attributes for employees, many different attributes may be used
for an object at any one point in the tree. LDAP objects also have a
unique name in the directory, the distinguished name (DN). The DN is
formed by the successive, sequential concatenation of the RDNs of the
object itself and its parents, back up to the root of the directory tree.

Thus, even in the simple case of Figure 1, the RDN for John Doe may
be init=JDD+ID=005047, and the DN may be orgName=IBM, siteName=Arizona,
init=JDD+ID=005047; while the RDN for Jane Deer may in fact be
emplName=Jane Deer and the DN may be orgName=IBM, siteName=Arizona,
emplName=Jane Deer.
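
As a minimal sketch of the naming rule just described, the following Python fragment forms a DN by joining the RDNs of an object and its parents, written root first as in the examples of this description. The function name build_dn and the ", " separator are illustrative assumptions and are not taken from the LDAP specification or from this disclosure.

```python
def build_dn(rdns_from_root):
    """Join RDNs, root first, in the order used in this description."""
    return ", ".join(rdns_from_root)


# RDNs for Jane Deer and her parents, taken from the example above.
jane_dn = build_dn(["orgName=IBM", "siteName=Arizona", "emplName=Jane Deer"])
print(jane_dn)
```

Because each RDN may combine different attributes, the resulting string has no fixed length or layout, which is why it makes a poor key for a table.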

The nature of the LDAP data model means that hierarchies may be 
varied and complex: similarly the naming scheme for objects as 
exemplified above also permits substantial variability. As a result, the 
schema definition itself for a directory does not provide a mechanism 
that can be easily adapted for storage and access of the directory 
contents. Consequently a data retrieval system is needed whereby objects 
can be assigned to a store with indexing to permit subsequent efficient 
search and retrieval. 

Accordingly, the present invention provides a data retrieval system 
as claimed in claim 1. 

Embodiments of the invention will now be described with reference 
to the accompanying drawings, in which: 

Figure 1 is a diagram illustrating a conventional LDAP directory;

Figure 2 is a conventional schema for the LDAP directory of Figure 1;

Figure 3 illustrates some hierarchical data;

Figures 4(a) & 4(b) are conventional index and data tables for the
hierarchical data of Figure 3;

Figures 5(a) & 5(b) are index and data tables for use in a data
retrieval system according to a first embodiment of the invention; and

Figures 6(a) & 6(b) are index and data tables for use in a data
retrieval system according to a second embodiment of the invention.



The most frequently used search and retrieval operations in the 
LDAP protocol, and for most hierarchical databases, are: 

1. Retrieval of the object from the specification of its DN. Using the
data of Figure 1, for example, retrieve the object orgName=Microsoft,
prodName=Windows3.1.

2. Search of the directory tree, from a parent object specified by DN,
where a value for an attribute is specified, so as to retrieve a subset
of its immediate children. For example, again using Figure 1, search from
orgName=IBM, siteName=Arizona for emplName=John.

3. As above, but to retrieve an object and any descendant of the
object rather than be restricted to immediate children. For example,
retrieve orgName=IBM and all its descendant objects. This may be combined
with further search criteria, for example, emplName=John Doe.

The precise details of the mapping of directory objects into tables
are not part of this disclosure. Although in practice the objects will be
mapped into several tables, it is simpler here to assume that all the 
objects are assigned to a single table. It is necessary, however, that 
each object is assigned a unique identifier. Although it is permissible 
to use the DN for this purpose, the DN is generally long and of varying 
length and therefore somewhat unsuitable. Thus, in the present 
embodiments, a unique numeric identifier is assigned to each object. 

Taking the data of Figure 3 for example, a conventional storage
scheme which might normally be adopted is shown in Figures 4(a) and 4(b).
Two tables are involved: an index table, Figure 4(a), which maps between
each object name (A, B, M, ..., Y) and its unique identifier (1...7), and
a data table, Figure 4(b), which holds a row for each object. The data
table also holds the unique identifier of the object's parent. The data
table has columns for the various object attributes, including one for
the unique object identifier. In summary:

Index table (row for each object):
Object name (DN)
Object identifier

Data table (row for each object):
Object attributes (one column for each)
Object identifier
Parent object identifier


Variations on this general theme may be used, e.g. the parent
identifier may be stored in the index table rather than in the data
table. However, all similar schemes have the characteristic that they
support direct retrieval of an object whose name is known, but do not
provide for efficient search and/or retrieval of descendants as explained
in point 3 above. For example, if the objects for the database of Figure
1 are not stored in order in tables of the type shown in Figure 4, then a
search for the descendants of IBM could only retrieve descendants of the
Arizona object stored sequentially after the location of the Arizona
object. The search engine would then need to traverse the table again to
find all children of the Arizona object stored before the Arizona object
in the table, and so on for these children.

A retrieval system operating on such an index and data table must
effectively navigate the hierarchy, thereby resulting in many sequential
operations and traverses of the data table. When implemented in a
relational database, this prohibits the use of the relational operators
to conduct a full descendants search in a single traverse of the table.
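
The cost of navigating the hierarchy can be sketched with Python's sqlite3 module. The table below follows the conventional layout (object identifier plus parent identifier). The seven-object hierarchy is a hypothetical stand-in for Figure 3, which is not reproduced in this text: A is assumed to be the root, B and M its children, C and D children of B, and X and Y children of M. The function name descendants_conventional is likewise an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (name TEXT, identifier INTEGER, parent INTEGER)")
# Hypothetical hierarchy (assumption, not the actual contents of Figure 3).
rows = [
    ("A", 1, None), ("B", 2, 1), ("M", 3, 1),
    ("C", 4, 2), ("D", 5, 2), ("X", 6, 3), ("Y", 7, 3),
]
conn.executemany("INSERT INTO data VALUES (?, ?, ?)", rows)


def descendants_conventional(conn, identifier):
    """Walk the tree one parent at a time; return (ids, selects issued)."""
    found, frontier, selects = [], [identifier], 0
    while frontier:
        parent = frontier.pop()
        selects += 1  # one select per object visited
        children = [r[0] for r in conn.execute(
            "SELECT identifier FROM data WHERE parent = ?", (parent,))]
        found.extend(children)
        frontier.extend(children)
    return sorted(found), selects


ids, selects = descendants_conventional(conn, 1)
print(ids, selects)
```

In this sketch one select is issued for every object visited, so the number of selects grows with the size of the subtree rather than staying constant.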

The data retrieval system according to the invention overcomes
these limitations.

For each object, its position in the hierarchy can be described by
the sequential concatenation of its own identifier, unique in the
directory, with those of its parents, taken in sequence. This
collection of values will be termed the position key. Note that in the
LDAP case, the position key would generally not be the same as the DN,
which is formed by the concatenation of identifiers unique at that
branch of the tree.
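
A minimal sketch of how a position key might be derived is given below. It assumes each object's parent identifier is available in a mapping (here the hypothetical dictionary parent_of, describing an assumed seven-object tree with object 1 as root) and that the key is written root first; both are assumptions made for illustration rather than details fixed by this description.

```python
def position_key(identifier, parent_of):
    """Return the identifiers from the root down to the object itself."""
    chain = []
    current = identifier
    while current is not None:
        chain.append(current)
        current = parent_of[current]  # None marks the root's parent
    chain.reverse()  # root first, object last
    return tuple(chain)


# Hypothetical hierarchy: object 1 is the root, 2 and 3 are its children.
parent_of = {1: None, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}
print(position_key(6, parent_of))
```

The last element of the key is always the object's own identifier, and the length of the key gives the object's level in the hierarchy.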

In a first embodiment of the invention, Figures 5(a) and 5(b), the
position key (pk1, pk2, pk3) for an object is preferably stored as a set
of values in a suitable number of columns assigned in both the index
and data tables (one column being required for each level in the
directory tree hierarchy). The advantage of the position key is that the
object data is now very easily searched on a hierarchical basis. Thus, if
all descendants of a particular object are required, the position key of
the parent can be used in a search condition on an object table (see the
example below). Moreover the key can also be used to control the level of
searching. If only the immediate descendants are required then the search
above can be further restricted by requiring that all other columns
assigned to the position, save that for the level of the immediate
descendant, be null. Variations on this allow control of the depth of the
search.

Using the tables of Figures 5(a) and 5(b):

1. A is found by a select on the data table where: identifier=1.

2. The immediate children of A are found by a select on the data table
where: pk1=1 & pk2!=null & pk3=null.

3. All descendants of A are found by a select on the data table where:
pk1=1 & pk2!=null.

4. The children of M are found by a select on the data table where:
pk1=1 & pk2=3 & pk3!=null

where != designates the boolean operator "not equal".
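
The four selects above can be tried out with the following sketch, which uses Python's sqlite3 module. The row contents are a hypothetical stand-in for Figure 5(b), which is not reproduced in this text: A (identifier 1) is assumed to be the root, B (2) and M (3) its children, C (4) and D (5) children of B, and X (6) and Y (7) children of M. In SQL a null test is written IS NULL or IS NOT NULL rather than with = or !=. The helper name names is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (name TEXT, identifier INTEGER, parent INTEGER,"
    " pk1 INTEGER, pk2 INTEGER, pk3 INTEGER)"
)
# Hypothetical rows (assumption): name, identifier, parent, pk1, pk2, pk3.
rows = [
    ("A", 1, None, 1, None, None),
    ("B", 2, 1, 1, 2, None),
    ("M", 3, 1, 1, 3, None),
    ("C", 4, 2, 1, 2, 4),
    ("D", 5, 2, 1, 2, 5),
    ("X", 6, 3, 1, 3, 6),
    ("Y", 7, 3, 1, 3, 7),
]
conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?, ?, ?)", rows)


def names(where):
    """Run one select on the data table and return the matching names."""
    sql = "SELECT name FROM data WHERE " + where + " ORDER BY identifier"
    return [r[0] for r in conn.execute(sql)]


a = names("identifier = 1")
children_of_a = names("pk1 = 1 AND pk2 IS NOT NULL AND pk3 IS NULL")
descendants_of_a = names("pk1 = 1 AND pk2 IS NOT NULL")
children_of_m = names("pk1 = 1 AND pk2 = 3 AND pk3 IS NOT NULL")
print(a, children_of_a, descendants_of_a, children_of_m)
```

Each result is obtained with a single select, whatever the size of the subtree.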

A second embodiment, Figures 6(a) & 6(b), overcomes the problem of
the first embodiment taking significant storage in the index table. It
will be seen that it is not necessary to use all the columns of the
position key to achieve the desired effect. Only the lowest column (i.e.
that non-null column furthest from the root) is significant in
identifying an object; the higher level columns contain redundant data.
Moreover the value in this column is the unique object identifier itself.
Accordingly, in the index table, Figure 6(a), instead of adding the
position key to the contents previously described it is sufficient to add
a single column, entitled Level, containing the level of that object in
the hierarchy (this indicates the position key column relevant to the
unique identifier).

In the data table, Figure 6(b), given the information content of
the position key, the two columns previously described that respectively
contain the unique object and parent identifiers are no longer needed.



The tables are now:

Index table (row for each object):
Object name
Object identifier
Object level in directory tree

Data table (row for each object):
Object attributes (one column for each)
Position key identifier (one column for each level in the hierarchy)

Using the tables of Figures 6(a) & 6(b):

1. A is found by a select on the index table where: identifier=1.

2. The immediate children of A are found by a select on the data table
where: pk1=1 & pk2!=null & pk3=null.

3. All descendants of A are found by a select on the data table where:
pk1=1 & pk2!=null.

4. The children of M are found by a select on the data table where:
pk2=3 & pk3!=null.
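
The second embodiment can be sketched in the same way. The index table (called idx here, because INDEX is an SQL keyword) holds name, identifier and level; the data table holds only attributes and the position key columns. The object's level is looked up first and then used to choose which position key column to test. The table contents are a hypothetical stand-in for Figures 6(a) and 6(b), and the names DEPTH, level_of and descendants are assumptions made for illustration.

```python
import sqlite3

DEPTH = 3  # assumed number of levels in the hierarchy

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE idx (name TEXT, identifier INTEGER, level INTEGER)")
conn.execute("CREATE TABLE data (attr TEXT, pk1 INTEGER, pk2 INTEGER, pk3 INTEGER)")
# Hypothetical objects (assumption): name and root-first position key.
objects = [
    ("A", (1, None, None)), ("B", (1, 2, None)), ("M", (1, 3, None)),
    ("C", (1, 2, 4)), ("D", (1, 2, 5)), ("X", (1, 3, 6)), ("Y", (1, 3, 7)),
]
for name, key in objects:
    level = sum(1 for part in key if part is not None)
    identifier = key[level - 1]  # lowest non-null column is the identifier
    conn.execute("INSERT INTO idx VALUES (?, ?, ?)", (name, identifier, level))
    conn.execute("INSERT INTO data VALUES (?, ?, ?, ?)", (name + "-attributes", *key))


def level_of(identifier):
    """Look up the hierarchical level of an object in the index table."""
    row = conn.execute(
        "SELECT level FROM idx WHERE identifier = ?", (identifier,)).fetchone()
    return row[0]


def descendants(identifier, immediate_only=False):
    """Return descendant identifiers using one select on the data table."""
    level = level_of(identifier)
    if level >= DEPTH:
        return []  # objects at the lowest level have no descendants
    where = [f"pk{level} = ?", f"pk{level + 1} IS NOT NULL"]
    if immediate_only and level + 2 <= DEPTH:
        where.append(f"pk{level + 2} IS NULL")
    # COALESCE picks the lowest non-null column, i.e. the object identifier.
    sql = ("SELECT COALESCE(pk3, pk2, pk1) FROM data WHERE "
           + " AND ".join(where) + " ORDER BY 1")
    return [r[0] for r in conn.execute(sql, (identifier,))]


print(descendants(1), descendants(1, immediate_only=True), descendants(3))
```

The column names are built only from the integer level, never from user input, so the string formatting of the WHERE clause does not expose the query to injection in this sketch.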

Thus, in comparison with the conventional tables of Figure 4, the
index table of the second embodiment contains one extra column and the
data table of the second embodiment contains extra columns for two less
than the number of levels in the hierarchy. However, search operations
are reduced to a simple traverse of the data table, instead of the
complex sequence of selects previously required.



CLAIMS 



1. A data retrieval system in which a plurality of objects having a
multi-level hierarchical relationship are stored, each object having a
respective parent and a set of children, said system including an index
table comprising a respective name and associated identifier for each
object, and a data table comprising a respective set of attributes and a
position key associated with each object in the system, each position key
comprising a series of components, each component corresponding to a
level of the hierarchy, a first component of said key storing the
identifier of an associated object, and each successive component storing
the identifier of the parent of the object stored in the previous
component.

2. A data retrieval system as claimed in claim 1 in which said index
table includes an attribute storing the respective level of each object
in the system.

3. A data retrieval system as claimed in claim 1 in which said index 
table comprises a position key associated with each object in the system, 
and said data table includes a respective identifier associated with each 
object in the system and a respective identifier of the parent of each 
object in the system. 

4. A method of retrieving an object name for an object stored in the 
data retrieval system claimed in claim 1 or 2, comprising the steps of: 

a. specifying an identifier of the object; and 

b. retrieving from the index table the name for the object whose 
identifier matches said specified identifier. 

5. A method of retrieving the immediate children of an object stored 
in the data retrieval system claimed in claim 2 comprising the steps of: 

a. specifying an identifier of the object; 

b. searching the index table to ascertain the hierarchical level of 
said object; 

c. selecting from the data table objects where the contents of the
position key component for said level matches the identifier of the
object;

d. for said objects where the contents of the position key
component for the hierarchical level below said level are not null
and where the contents of the position key component for the next
hierarchical level below said level are null, retrieving the
respective object identifiers.

6. A method of retrieving all descendants of an object stored in a 
data retrieval system claimed in claim 2, comprising the steps of: 

a. specifying an identifier of the object; 

b. searching the index table to ascertain the hierarchical level of 
said object; 

c. selecting from the data table objects where the contents of the 
position key component for said level matches the identity of the 
object; 

d. for said objects where the contents of the position key 
component for the level below said level are not null, retrieving 
the respective object identifiers. 

7. A method of retrieving an object name for an object stored in the
data retrieval system claimed in claim 3, comprising the steps of:

a. specifying an identifier of the object; and 

b. retrieving from the data table the name for the object whose 
identifier matches said identifier. 



8. A method of retrieving the immediate children of an object stored
in the data retrieval system claimed in claim 3 comprising the steps of:

a. specifying an identifier of the object; and

b. for objects where the contents of the parent identifier match
the identifier of the object, retrieving the respective object
identifiers.

9. A method of retrieving all descendants of an object stored in a 
data retrieval system claimed in claim 3, comprising the steps of: 

a. specifying an identifier of the object;

b. ascertaining the level of the object according to when a first 
component of said position key matches the identifier of said 
object; 

c. selecting from the data table objects where the contents of the 
position key component for said level matches the identity of the 
object; and 

d. for said objects where the contents of the position key 
component for the level below said level are not null, retrieving 
the respective object identifiers. 



10. A method as claimed in claims 5, 6 or 9 where steps c) and d) are 
carried out in a single traverse of said data table. 

11. A method as claimed in claims 5, 6, 8, 9 or 10 where the retrieved 
object identifiers are stored in the first component of respective 
position keys. 